ABSTRACT

We use a complete and uniform sample of almost half a million galaxies
from the Sloan Digital Sky Survey to characterise the distribution of
stellar mass in the low-redshift Universe. Galaxy abundances are well
determined over almost four orders of magnitude in stellar mass, and are
reasonably but not perfectly fit by a Schechter function with
characteristic stellar mass $m_* = 6.7 \times 10^{10}\,M_\odot$ and with
faint-end slope $\alpha = -1.155$. For a standard cosmology and a
standard stellar Initial Mass Function, only 3.5% of the baryons in the
low-redshift Universe are locked up in stars. The projected
autocorrelation function of stellar mass is robustly and precisely
determined for $r_p < 30\,h^{-1}$ Mpc. Over the range
$10\,h^{-1}$ kpc $< r_p < 10\,h^{-1}$ Mpc it is extremely well
represented by a power law. The corresponding three-dimensional
autocorrelation function is
$\xi^*(r) = (r/6.1\,h^{-1}\,\mathrm{Mpc})^{-1.84}$. Relative to the dark
matter, the bias of the stellar mass distribution is approximately
constant on large scales, but varies by a factor of five for
$r_p < 1\,h^{-1}$ Mpc. This behaviour is approximately but not perfectly
reproduced by current models for galaxy formation in the concordance
$\Lambda$CDM cosmology. Detailed comparison suggests that a fluctuation
amplitude $\sigma_8 \sim 0.8$ is preferred to the somewhat larger value
adopted in the Millennium Simulation models with which we compare our
data. This comparison also suggests that observations of stellar mass
autocorrelations as a function of redshift might provide a powerful test
for the nature of Dark Energy.

Key words: galaxies: clusters: general - galaxies: distances and redshifts - cosmology: 
theory - dark matter - large-scale structure of Universe. 



Four hundred years ago Galileo turned his telescope to the 
Milky Way and discovered it to consist of countless faint 
stars. One hundred and fifty years later, Kant speculated 
that it might be an enormous, rotating stellar swarm, held 
together by gravity in a similar way to the Solar System, 
and that other nebulae might be similar but extremely dis- 
tant "island universes". These ideas were finally confirmed 
when Hubble established the extragalactic distance scale 
in the 1920's. Stars were accepted as the dominant form
of matter in the Universe from this time until the 1980's,
when new theoretical ideas suggested that the dark matter
discovered by Zwicky (1933) might consist of neutral,
non-baryonic elementary particles (Cowsik & McClelland 1973;
Peebles 1982) and X-ray images showed that most of the
baryons in rich clusters are in the form of hot, intergalactic
gas (Forman & Jones 1982). It now seems clear that
baryons are not the dominant form of matter in our Universe,
and that stars account for only a small fraction of the
baryons (e.g. Fukugita, Hogan & Peebles 1998; Cole et al. 2001;
Komatsu et al. 2009). Nevertheless, stars are the only
component of the cosmic mix for which a complete and ro- 
bust census is possible. Such surveys teach us where and 
with what efficiency baryons were converted into galaxies, 
and can provide stringent constraints on our general struc- 
ture formation paradigm. 

In recent years it has become clear that stellar masses
can be measured for galaxies in a robust way from
multi-band photometric measurements of their spectral
energy distributions (Bell & de Jong 2001; Blanton & Roweis
2007) or from combined photometry and spectroscopy
(Kauffmann et al. 2003). Low-mass stars contribute very
little to the light of galaxies, so the principal uncertainty
in these measurements comes from the stellar Initial Mass
Function (IMF). For any particular assumed IMF, the
uncertainties due to dust, to metallicity, and to details of the
star-formation history turn out to be quite small provided
reliable photometry is available out to wavelengths of order
1 micron. Stellar mass is then a more natural way to
characterise the number of stars in a galaxy than, for example,
B- or K-band luminosity.

In the current paper we use a complete sample of al- 
most half a million galaxies with excellent photometry and 
accurate redshifts to study the distribution of stellar mass in 
the low redshift universe. This separates into two parts: the 
abundance of galaxies as a function of their stellar mass, and 
the clustering of stellar mass on scales larger than those of 
individual galaxies. There have been previous studies of the
first of these statistics, the so-called mass function of
galaxies (e.g. Cole et al. 2001; Bell et al. 2003; Wang et al. 2006;
Panter et al. 2007; Baldry et al. 2008). Our results agree
with this previous work, though with smaller statistical error 
bars because of the larger sample (systematic uncertainties 
due to the IMF remain as large as before, of course). 

The second statistic has not, to our knowledge, been
estimated previously, although there have been a few
measurements of galaxy clustering weighted by stellar light or
by dynamical mass (e.g. Boerner et al. 1989). As we show,
the autocorrelation of stellar mass is remarkable for the
accuracy with which it can be estimated from our sample, and
for the fact that it turns out to be almost a perfect power
law over three orders of magnitude in spatial scale. The near
power-law behaviour of galaxy correlations was emphasised
in early work (e.g. Davis & Peebles 1983) and has been
examined in some detail in previous studies with Sloan Digital
Sky Survey data (e.g. Zehavi et al. 2004, 2005; Masjedi et al.
2006). The latter noted that deviations from a pure power
law are detected at high significance in almost all cases. In 
contrast, our stellar mass autocorrelation does not deviate 
from the best-fit power law by more than 12.5% for sepa- 
rations between 10 kpc and 10 Mpc, a behaviour which we 
will show to be partially but not perfectly reproduced by 
existing galaxy formation models. 

The structure of our paper is as follows. In Section 2 we 
discuss the observational dataset we analyse and the theoret- 
ical models with which we compare it. Sections 3 and 4 then 
present our results for the stellar mass function of galaxies 
and for the autocorrelation function of stellar mass, respec- 
tively. A concluding section discusses these results, compares 
them with model predictions, and suggests that the shape of 
the mass autocorrelation function might provide a means to 
estimate how the cosmic scale factor and the linear growth 
factor depend on redshift, and hence to constrain the prop- 
erties of Dark Energy. 



2 DATA 

2.1 SDSS and NYU-VAGC 

This study is based on the final data release (DR7;
Abazajian et al. 2008) of the Sloan Digital Sky Survey
(SDSS; York et al. 2000). This contains images of a quarter
of the sky obtained using a drift-scan camera (Gunn et al.
1998) in the u, g, r, i, z bands (Fukugita et al. 1996;
Smith et al. 2002; Ivezić et al. 2004), together with spectra
of almost a million objects obtained with a fibre-fed
double spectrograph (Gunn et al. 2006). Both instruments
were mounted on a special-purpose 2.5 meter telescope
(Gunn et al. 2006) at Apache Point Observatory.
The imaging data are photometrically (Hogg et al. 2001;
Tucker et al. 2006) and astrometrically (Pier et al. 2003)
calibrated, and were used to select spectroscopic targets for
the main galaxy sample (Strauss et al. 2002), the luminous
red galaxy sample (Eisenstein et al. 2001), and the quasar
sample (Richards et al. 2002). Spectroscopic fibres are
assigned to the targets using an efficient tiling algorithm
designed to optimise completeness (Blanton et al. 2003). The
details of the survey strategy can be found in York et al.
(2000) and an overview of the data pipelines and products is
provided in the Early Data Release paper (Stoughton et al.
2002). More details on the photometric pipeline can be found
in Lupton et al. (2001) and on the spectroscopic pipeline in
SubbaRao et al. (2002).


For this paper we take data from Sample dr72 of
the New York University Value Added Catalogue
(NYU-VAGC). This is an update of the catalogue constructed by
Blanton et al. (2005b) and is based on the full SDSS/DR7
data. Starting from Sample dr72, we construct a
magnitude-limited sample of galaxies with $r \le 17.6$ and
spectroscopically measured redshifts in the range 0.001 < z < 0.5. Here
r is the r-band Petrosian apparent magnitude, corrected for 
Galactic extinction, and the apparent magnitude limit is 
chosen in order to get a sample that is uniform and com- 
plete over the entire area of the survey. We also restrict 
ourselves to galaxies located in the main contiguous area 
of the survey in the northern Galactic cap, excluding the 
three survey strips in the southern cap (about 10% of the 
full survey area). These restrictions result in a final sample
of 486,840 galaxies. 
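
The selection just described amounts to three simple cuts. The following is a minimal sketch of them, not the NYU-VAGC interface: the dictionary keys `r_petro`, `z` and `in_north_cap` are hypothetical names chosen for illustration, and the toy catalogue is invented.

```python
def select_sample(galaxies, r_limit=17.6, z_min=0.001, z_max=0.5):
    """Apply the magnitude, redshift and area cuts quoted in the text.

    `galaxies` is a list of dicts with hypothetical keys:
      'r_petro'      extinction-corrected r-band Petrosian magnitude
      'z'            spectroscopic redshift
      'in_north_cap' True inside the contiguous northern Galactic cap area
    """
    return [
        g for g in galaxies
        if g["r_petro"] <= r_limit
        and z_min < g["z"] < z_max
        and g["in_north_cap"]
    ]


# Toy catalogue: only the first entry passes all three cuts.
toy_catalogue = [
    {"r_petro": 16.9, "z": 0.08, "in_north_cap": True},
    {"r_petro": 17.9, "z": 0.08, "in_north_cap": True},    # too faint
    {"r_petro": 16.5, "z": 0.0005, "in_north_cap": True},  # redshift too low
    {"r_petro": 16.5, "z": 0.05, "in_north_cap": False},   # southern strip
]
selected = select_sample(toy_catalogue)
```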

In addition to the magnitudes, redshifts and positions of 
the galaxies, the NYU-VAGC provides several other
quantities which are needed in our analysis. The first is a stellar
mass for each galaxy, which is based on its redshift and the
five-band SDSS photometric data, as described in detail in
Blanton & Roweis (2007). This estimate corrects implicitly
for dust and assumes a universal Initial Mass Function of
Chabrier (2003) form. As we demonstrate in Appendix A,
once all estimates are adapted to assume the same IMF, the
Blanton-Roweis masses agree quite well with those obtained
from the simple, single-colour estimator of Bell et al. (2003)
and also with those derived by Kauffmann et al. (2003) from
a combination of SDSS photometry and spectroscopy. Given
the very large sample provided by the SDSS, sampling fluc- 
tuations and "cosmic variance" are small. Uncertainties in 
the mass estimation procedure dominate the systematic er- 
ror budget for most of the results we present below. Ap- 
pendix A shows that such uncertainties primarily affect the 
overall stellar mass scale, as do uncertainties in the IMF it- 
self, as long as it is assumed universal. Results that depend 
only on the relative stellar masses of galaxies (for exam- 
ple, the stellar mass correlation function) are therefore much 
more weakly affected than those that depend directly on the 
mass scale (for example, the mean stellar mass density of the 
Universe). 

The NYU-VAGC also provides the necessary informa- 
tion to correct for incompleteness in our spectroscopic sam- 
ple. In particular, we use a mask which shows which areas 
of the sky have been targeted, and which have not, either 
because they are outside the survey boundary, because they
contain a bright confusing source, or because observing con- 
ditions were too poor to obtain all the required data. This 
mask defines the effective area of the survey on the sky, 
which is 6437 square degrees for the sample we use here. 
This survey area is divided into a large number of smaller 
subareas for each of which the NYU-VAGC lists a
spectroscopic completeness $f_{sp}$. This is defined as the fraction of
the photometrically defined target galaxies in the subarea
for which usable spectra were obtained. The average over
our sample galaxies is $\langle f_{sp} \rangle = 0.92$. Within each subarea
the galaxies with spectra can be assumed to be a random 
sample of all possible targets, with the important exception 
that fibres cannot be closer than 55 arcsec in a single
spectroscopic exposure, so that at most one fibre can be placed on a
galaxy in a pair or group with smaller angular size than this.
(More fibres may be assigned to such clumps if they happen
to lie in the overlap region between two or more
spectroscopic observations.) It is important to correct for such
"fibre collisions" when measuring clustering. As discussed in
more detail below, we use the procedures of Li et al. (2006b)
and Li et al. (2007) for this purpose. These are based on
comparing pair counts as a function of angular separation
in the spectroscopic sample and in its parent photometric
sample. General incompleteness is dealt with by weighting
each galaxy by $1/f_{sp}$ in all statistical analyses.

A final observational issue is that the SDSS photometric
catalogue from which our spectroscopic galaxy sample
is drawn is incomplete for low surface brightness galaxies
(Blanton et al. 2005a). We discuss this in Appendix B, based
on the recent analysis by Baldry et al. (2008), concluding
that for our purposes the effects are negligible except
possibly at the very lowest stellar masses we study.



2.2 Millennium Simulation, semi-analytic galaxy 
catalogue, and mock redshift surveys 

We have constructed a set of 20 mock SDSS galaxy
catalogues from the Millennium Simulation (Springel et al.
2005) using both the sky mask and the magnitude and
redshift limits of our real SDSS sample. The Millennium
Simulation uses $10^{10}$ particles to follow the dark matter
distribution in a cubic region $500\,h^{-1}$ Mpc on a side. The
cosmological parameters assumed are $\Omega_m = 0.25$,
$\Omega_\Lambda = 0.75$, $n = 1$, $\sigma_8 = 0.9$ and $h = 0.73$.
Galaxy formation within the evolving dark matter distribution is
simulated in postprocessing using semi-analytic methods tuned to give
a good representation of the observed low-redshift galaxy population. Our
mock catalogues are based on the galaxy formation model
of Croton et al. (2006) and are constructed from the
publicly available z = 0 data using the methodology of Li et al.
(2006b) and Li et al. (2007). These mock catalogues allow
us to derive realistic error estimates for the statistics we 
measure, including both sampling and cosmic variance un- 
certainties. 

We also use galaxy data from the Millennium Simulation
archive to compare and contrast predictions for the
total mass and stellar mass correlation functions at z = 0.07
and at higher redshift. These data are based on the galaxy
formation model of De Lucia & Blaizot (2007).

Figure 1. Stellar mass function for galaxies in the SDSS/DR7
(symbols). Error bars show the $1\sigma$ scatter between 20 mock
catalogues constructed from the Millennium Simulation using the
same sky mask and magnitude and redshift limits as for the real
sample. The dashed line is the best fit single Schechter function,
while the solid line is a fit based on three disjoint Schechter
functions; its parameters are listed in Table 1. The lower panel shows
the logarithmic deviation between the data and each of these
models. For clarity the error bars are plotted on the more
accurate, piece-wise fit only.



3 THE STELLAR MASS FUNCTION OF 
GALAXIES 

A first statistic which we can estimate from the data we 
have available is the abundance of galaxies as a function of 
their stellar mass. For each observed galaxy $i$ we define the
quantity $z_{\max,i}$ to be the maximum redshift at which the
observed galaxy would satisfy the apparent magnitude limit
of our sample, $r \le 17.6$. Evolutionary and K-corrections are
included when calculating $z_{\max,i}$ as described by Li et al.
(2006a) and Li et al. (2007). This allows us to define $V_{\max,i}$
for the galaxy in question as the total comoving volume of
the survey out to redshift $z_{\max,i}$. The stellar mass function
can then be estimated as

$\Phi(m_*)\,\Delta m_* = \sum_i \left(f_{sp,i}\,V_{\max,i}\right)^{-1}$    (1)

where the sum extends over all sample galaxies with stellar
mass in the range $m_* \pm 0.5\,\Delta m_*$.
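
The estimator of equation (1) can be sketched in a few lines. This is an illustration under stated assumptions rather than the authors' code: the function name `vmax_mass_function` is hypothetical, the bins are taken to be uniform in $\log_{10} m_*$ (so the result is per dex), and the three-galaxy input is invented.

```python
import math
from collections import defaultdict


def vmax_mass_function(log10_masses, f_sp, v_max, bin_width=0.1):
    """Return {bin_centre: Phi}, where Phi is the sum of 1/(f_sp * V_max)
    over the galaxies in each bin, divided by the bin width.

    Assumption: bins are uniform in log10(stellar mass), so Phi is an
    abundance per unit volume per dex.
    """
    totals = defaultdict(float)
    for logm, f, v in zip(log10_masses, f_sp, v_max):
        k = math.floor(logm / bin_width)
        centre = round((k + 0.5) * bin_width, 6)  # rounded to get clean keys
        totals[centre] += 1.0 / (f * v)           # one term of equation (1)
    return {c: s / bin_width for c, s in sorted(totals.items())}


# Invented example: two galaxies fall in the first bin, one in the second.
phi = vmax_mass_function(
    log10_masses=[10.1, 10.3, 10.7],
    f_sp=[1.0, 0.5, 1.0],
    v_max=[100.0, 100.0, 200.0],
    bin_width=0.5,
)
```

A galaxy with lower completeness or smaller accessible volume receives a larger weight, which is how the estimator compensates for objects missed by the survey.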

In Figure 1 we show the stellar mass function
determined in this way for the galaxies in our sample. The
error bars are estimated from the scatter among the mass
functions of our 20 mock catalogues. As already noted by
Croton et al. (2006), the mass function of this model agrees
reasonably well with observation, and indeed the number
of galaxies in the real catalogue lies well within the scatter
of the numbers found for the 20 mock catalogues. The
error bars should thus account correctly for the uncertainties
due to sampling and cosmic variance, but they do not
include systematic uncertainties in the NYU-VAGC estimates
of stellar masses. Appendix A shows that these may be
reasonably represented by a ±0.1 dex systematic uncertainty in
the overall mass scale. Note that the error bars on
neighbouring points are strongly correlated.
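
As an illustration of error bars taken from the scatter among mocks, the sketch below computes a per-bin sample standard deviation across a set of mock estimates. The helper name `mock_error_bars` and the toy numbers are assumptions made for this example; the text does not specify the exact scatter statistic used.

```python
import statistics


def mock_error_bars(mock_estimates):
    """Per-bin scatter among mock catalogues.

    `mock_estimates` is a list with one entry per mock; each entry is a list
    of the statistic (e.g. the mass function) evaluated in the same bins.
    Assumption: the scatter is the sample standard deviation in each bin.
    """
    per_bin = zip(*mock_estimates)  # regroup values bin by bin
    return [statistics.stdev(values) for values in per_bin]


# Three toy "mocks" with two bins each.
errors = mock_error_bars([[1.0, 10.0], [2.0, 10.0], [3.0, 10.0]])
```

This per-bin scatter says nothing about the correlations between neighbouring bins noted above; capturing those would require the full covariance among the mocks.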

The mass function of Figure 1 is in good agreement with
estimates from the 2dFGRS (Cole et al. 2001) and from
earlier releases of the SDSS (e.g. Bell et al. 2003; Wang et al.
2006; Panter et al. 2007; Baldry et al. 2008). In Appendix B
we present an explicit comparison with Baldry et al. (2008)
which allows us to assess the effect of a small but significant
systematic, the incompleteness of the SDSS sample at low
mass and low surface brightness. This causes a slight
underestimate of abundances below $10^{8.5}\,M_\odot$. The purely
statistical error bars on our mass function are smaller than in earlier
work because of the substantially larger size of the sample.
It is well determined all the way from $10^{8}\,M_\odot$ up to almost
$10^{12}\,M_\odot$. A fit with a single Schechter function (Schechter
1976) is not fully consistent with the data. In particular, it
significantly underpredicts the abundance of the most
massive galaxies. Nevertheless, it provides a reasonable and
simple representation of our results. The parameters we obtain,
$\log_{10}(m_*/h^{-2}M_\odot) = 10.525 \pm 0.005$, $\alpha = -1.155 \pm 0.008$
and $\Phi^* = 0.0083 \pm 0.0002\,h^{3}\,\mathrm{Mpc}^{-3}$, are similar to those
found in earlier studies. (The errors here are approximate
$1\sigma$ uncertainties in each parameter marginalised over the
uncertainties in the other parameters.) Our mass function
data can be considerably better represented by fitting three
different Schechter functions (i.e. with different parameters)
over three disjoint mass ranges. This representation is also
plotted on top of the data in Figure 1 and its parameters
are listed in Table 1. It provides a compact and accurate
summary of our results.
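
As a rough cross-check on the single Schechter fit, its parameters can be integrated analytically to give a stellar mass density. The sketch below assumes the normalisation convention $\Phi(m)\,dm = \Phi^* (m/m_*)^{\alpha} e^{-m/m_*}\,dm/m_*$, for which the integrated density is $\Phi^* m_* \Gamma(\alpha+2)$; this convention is an assumption made here, not something stated in the text, and the function name is hypothetical.

```python
import math


def schechter_mass_density(phi_star, log10_m_star, alpha):
    """Integrated stellar mass density of a single Schechter function.

    Assumes Phi(m) dm = phi_star * (m/m_star)**alpha * exp(-m/m_star) dm/m_star,
    so that the integral of m * Phi(m) over all m is
    phi_star * m_star * Gamma(alpha + 2).
    """
    m_star = 10.0 ** log10_m_star
    return phi_star * m_star * math.gamma(alpha + 2.0)


# Single Schechter parameters quoted in the text
# (phi_star in h^3 Mpc^-3, m_star in h^-2 M_sun, so rho is in h M_sun Mpc^-3).
rho_single = schechter_mass_density(0.0083, 10.525, -1.155)
```

Under this convention the result comes out near $3.1 \times 10^{8}\,h\,M_\odot\,\mathrm{Mpc}^{-3}$, which is what one would expect if the single Schechter form captures most of the stellar mass despite its imperfect fit at the massive end.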

Another useful representation of the stellar mass
function is in terms of the percentage points of the cumulative
stellar mass distribution. According to our results, half of all
the stellar mass in galaxies is in objects with individual
stellar mass greater than $1.86 \times 10^{10}\,h^{-2}M_\odot$. The corresponding
5%, 10%, 20%, 80%, 90% and 95% points are 1.03, 2.47, 5.86,
44.2, 65.8 and $89.9 \times 10^{9}\,h^{-2}M_\odot$ respectively. (We have used
our triple Schechter fit to extrapolate the low-mass end of
the mass function when calculating these numbers.) Thus
60% of all stars are in galaxies with stellar mass within a
factor of 2.75 of $1.6 \times 10^{10}\,h^{-2}M_\odot$, which is about half the
stellar mass of the Milky Way. A related characteristic mass
which will be of interest below is the mass-weighted mean
stellar mass, which can also be thought of as the expected
stellar mass of the host of a randomly chosen star. This is
$M_* = 2.85 \times 10^{10}\,h^{-2}M_\odot$ and is close to the stellar mass of
the Milky Way.
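
The "60% within a factor of 2.75" statement follows from the quoted 20% and 80% points: the central mass is their geometric mean and the factor is the square root of their ratio. A short check, using only numbers given in the text (the variable names are illustrative):

```python
import math

# 20% and 80% points of the cumulative stellar mass distribution,
# in units of h^-2 M_sun, as quoted in the text.
p20 = 5.86e9
p80 = 44.2e9

# Geometric centre of the interval holding the middle 60% of stellar mass,
# and the multiplicative half-width of that interval.
centre = math.sqrt(p20 * p80)
factor = math.sqrt(p80 / p20)
```

Here `centre` evaluates to about $1.6 \times 10^{10}\,h^{-2}M_\odot$ and `factor` to about 2.75, matching the values in the paragraph above.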



A straightforward integration of our stellar mass
function gives the mean comoving stellar mass density of the
low-redshift Universe (at z ~ 0.07, see below). This is
$\rho_* = (3.14 \pm 0.10) \times 10^{8}\,h\,M_\odot\,\mathrm{Mpc}^{-3}$, where the error bar is
again derived from the scatter among our 20 mock
catalogues and so accounts for sampling and cosmic variance
effects, but not for systematic errors in the mass
determinations of individual galaxies. In the standard concordance
cosmology (e.g. Komatsu et al. 2009) only 3.5% of the baryons
in the low-redshift Universe are locked up in stars. Clearly,
galaxy formation has been a very inefficient process.
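
The 3.5% figure can be reproduced approximately by dividing the stellar mass density by the mean baryon density. The sketch below uses the standard critical density, $2.775 \times 10^{11}\,h^{2}\,M_\odot\,\mathrm{Mpc}^{-3}$, together with $\Omega_b h^2 = 0.0227$ and $h = 0.7$; the last two are illustrative values assumed here and are not necessarily the exact ones adopted by the authors.

```python
RHO_CRIT_H2 = 2.775e11   # critical density in h^2 M_sun Mpc^-3
OMEGA_B_H2 = 0.0227      # assumed baryon density parameter times h^2
H = 0.7                  # assumed dimensionless Hubble parameter

RHO_STAR_H = 3.14e8      # stellar mass density from the text, in h M_sun Mpc^-3


def stellar_baryon_fraction(rho_star_h=RHO_STAR_H, h=H, omega_b_h2=OMEGA_B_H2):
    """Fraction of all baryons that are in stars."""
    rho_star = rho_star_h * h              # M_sun Mpc^-3
    rho_baryon = omega_b_h2 * RHO_CRIT_H2  # M_sun Mpc^-3 (the h^2 factors cancel)
    return rho_star / rho_baryon


fraction = stellar_baryon_fraction()
```

With these inputs the fraction is close to 0.035. Note that the result scales linearly with the assumed $h$, and with the overall stellar mass scale set by the IMF.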



4 STELLAR MASS CORRELATION 
FUNCTIONS 

To obtain a reliable estimate of the clustering of stellar mass,
the observed sample must be compared with a "random
sample" which is unclustered but fills the same region of the sky
and has the same, stellar mass-dependent redshift
distribution. We construct our random sample from the observed
sample itself, as described in detail in Li et al. (2006a). For
each real galaxy we generate 10 sky positions at random
within our DR7 mask, and we assign to each of them the
properties of the real galaxy, in particular, its values of
redshift, stellar mass, $V_{\max}$ and $f_{spec}$. The validity of the
resulting random sample rests on two requirements: 1) the survey
area should be large enough that structures in the real
sample are wiped out by randomising in angle; 2) the effective
depth of the survey must not vary from region to region.
Both are true to good accuracy for our sample, which
covers > 6000 deg$^2$, is complete down to r = 17.6 and is little
affected by foreground dust over the entire survey region.
Extensive tests show that random samples constructed in
this way produce indistinguishable results from those using
the traditional method (Li et al. 2006a). The advantage of
this technique in the current application is that it is
guaranteed to maintain the complex relation between stellar mass
and photometric properties (and thus sample selection
criteria) which holds in the real data.
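
The random-sample construction described above can be sketched as follows. This is a toy version: the real procedure draws positions inside the DR7 survey mask, whereas here a rectangular RA/Dec patch stands in for the mask, and the function names and dictionary keys are hypothetical.

```python
import math
import random


def random_sky_position(rng, ra_range, dec_range):
    """Draw a point uniformly on the sphere within a rectangular RA/Dec patch
    (a toy stand-in for the real survey mask). Angles are in degrees."""
    ra = rng.uniform(*ra_range)
    sin_lo = math.sin(math.radians(dec_range[0]))
    sin_hi = math.sin(math.radians(dec_range[1]))
    # Uniform in sin(dec) gives uniform density per unit solid angle.
    dec = math.degrees(math.asin(rng.uniform(sin_lo, sin_hi)))
    return ra, dec


def build_random_sample(galaxies, n_per_galaxy=10, seed=0,
                        ra_range=(120.0, 240.0), dec_range=(0.0, 60.0)):
    """Clone each galaxy n_per_galaxy times at random sky positions,
    keeping redshift, stellar mass, V_max and f_spec unchanged."""
    rng = random.Random(seed)
    randoms = []
    for g in galaxies:
        for _ in range(n_per_galaxy):
            clone = dict(g)
            clone["ra"], clone["dec"] = random_sky_position(rng, ra_range, dec_range)
            randoms.append(clone)
    return randoms


toy = [{"ra": 150.0, "dec": 2.0, "z": 0.07, "mstar": 3e10, "vmax": 1e6, "fspec": 0.9}]
randoms = build_random_sample(toy)
```

Because every random point inherits the redshift and weights of a real galaxy, the random sample automatically follows the mass-dependent redshift distribution of the data.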

We begin by estimating the redshift-space stellar mass
correlation function, $\xi^*(r_p, \pi)$, using an appropriate version
of the Landy & Szalay (1993) estimator:

$\xi^*(r_p,\pi) = \dfrac{DD^*(r_p,\pi) - 2\,DR^*(r_p,\pi) + RR^*(r_p,\pi)}{RR^*(r_p,\pi)}$    (2)

where the data-data, data-random and random-random pair
counts are weighted as follows:

$DD^*(r_p,\pi) = \sum_{(i,j) \in DD(r_p,\pi)} \dfrac{m_{*,i}\,m_{*,j}}{f_{coll,ij}\,f_{spec,i}\,f_{spec,j}\,V_{ij}}$    (3)

$DR^*(r_p,\pi) = \sum_{(i,j) \in DR(r_p,\pi)} \dfrac{m_{*,i}\,m_{*,j}}{f_{spec,i}\,f_{spec,j}\,V_{ij}}$    (4)

$RR^*(r_p,\pi) = \sum_{(i,j) \in RR(r_p,\pi)} \dfrac{m_{*,i}\,m_{*,j}}{f_{spec,i}\,f_{spec,j}\,V_{ij}}$    (5)

where the sums are over all pairs of the relevant type
(DD, DR or RR) with separations in the two-dimensional
separation bin labelled by $r_p$ and $\pi$, the pair separations
perpendicular and parallel to the line of sight. Note that in
the DD and RR sums each pair appears twice (as $(i, j)$ and
$(j, i)$). In these expressions, $m_{*,i}$ and $m_{*,j}$ are the stellar
masses of the members of each pair, $f_{spec,i}$ and $f_{spec,j}$ are
their associated spectroscopic completeness fractions, and
$V_{ij} = \min(V_{\max,i}, V_{\max,j})$ is the volume over which both
galaxies would be included in the sample. To reduce
sampling noise, random samples are usually constructed with
many more particles than the real sample. To normalise
appropriately, $RR^*$ needs to be multiplied by $(N_g/N_r)^2$ and
$DR^*$ by $N_g/N_r$, where $N_g$ and $N_r$ are the numbers of
galaxies in the real and random samples, respectively. In our case,
$N_r = 10 \times N_g$.

Table 1. Parameters of a triple Schechter function fit to the stellar mass function of SDSS galaxies

mass range                    $\Phi^*$                                        $\alpha$     $\log_{10} M^*$
($h^{-2}M_\odot$)             ($h^{3}\,\mathrm{Mpc}^{-3}\,\log_{10}M^{-1}$)                ($h^{-2}M_\odot$)

8.00 < $\log_{10} M$ < 9.33     0.0146(5)                                     -1.13(09)    9.61(24)
9.33 < $\log_{10} M$ < 10.67    0.0132(7)                                     -0.90(04)    10.37(02)
10.67 < $\log_{10} M$ < 12.00   0.0044(6)                                     -1.99(18)    10.71(04)

Figure 2. The projected stellar mass autocorrelation function in
the SDSS is plotted as triangles in the top panel, and is
compared to the projected autocorrelation function of dark
matter at z = 0.07 in the Millennium Simulation (the solid line).
The dashed line is a power-law fit to SDSS data over the range
$10\,h^{-1}$ kpc $< r_p < 10\,h^{-1}$ Mpc and corresponds to a
three-dimensional autocorrelation function $\xi^*(r) = (r/r_0)^{-1.84}$ with
$r_0 = 6.1\,h^{-1}$ Mpc. The ratio between the stellar mass and dark
matter projected autocorrelation functions is shown
logarithmically in the middle panel and linearly in the bottom panel. Error
estimates in all three panels come from the scatter among
similarly estimated correlations for 20 mock galaxy catalogues
constructed from the Millennium Simulation using the same selection
criteria as the real sample. See the text for details.

Figure 3. Relative errors in our estimate of the projected stellar
mass autocorrelation function based on the scatter between
estimates from 100 bootstrap resamplings of the SDSS data (dashed
line) and between estimates from 20 mock galaxy catalogues
constructed from the Millennium Simulation using the same selection
criteria as for the real SDSS sample (solid line).

Figure 4. Redshift distributions of contributions to the galaxy
count (dotted line), the cosmic stellar mass density estimate
(dashed line), and the stellar mass autocorrelation function (solid
line) of our SDSS sample. All curves are normalised to have the
same integral.

The final weight in the above equations is the factor
$f_{coll,ij}$ which appears in the data-data counts only. This is
a function of the angular separation $\theta_{ij}$ of the two galaxies,
and is defined as the fraction of pairs of angular
separation $\theta$ which are missing from our spectroscopic sample as
a direct consequence of the fibre collision problem. We
estimate this fraction in the same way as Li et al. (2006b) and
Li et al. (2007). We calculate angular correlation functions
$w(\theta)$ for the spectroscopic sample and for its parent
photometric sample, and we then define

$f_{coll}(\theta) = [1 + w_{sp}(\theta)] / [1 + w_{ph}(\theta)]$.    (6)

Detailed tests of this procedure can be found in Li et al.
(2006b).
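
The pieces of the weighted estimator (the pair weights, the fibre-collision factor and the normalised Landy-Szalay combination) can be sketched as below. This is a minimal illustration under the definitions given in the text, not the authors' pipeline: the function names are hypothetical, pair finding and binning are omitted, and the inputs are assumed to be already-summed weighted pair counts for a single $(r_p, \pi)$ bin.

```python
def fibre_collision_factor(w_sp, w_ph):
    """Ratio [1 + w_sp(theta)] / [1 + w_ph(theta)] of equation (6)."""
    return (1.0 + w_sp) / (1.0 + w_ph)


def pair_weight(m_i, m_j, f_i, f_j, vmax_i, vmax_j, f_coll=1.0):
    """Weight of one pair: stellar mass product divided by the completeness
    factors and by V_ij = min(V_max,i, V_max,j). f_coll differs from 1 only
    for data-data pairs."""
    v_ij = min(vmax_i, vmax_j)
    return m_i * m_j / (f_coll * f_i * f_j * v_ij)


def landy_szalay(dd, dr, rr, n_gal, n_rand):
    """Landy-Szalay estimate from weighted pair sums in one bin, after
    rescaling DR by N_g/N_r and RR by (N_g/N_r)^2 as described in the text."""
    dr_n = dr * n_gal / n_rand
    rr_n = rr * (n_gal / n_rand) ** 2
    return (dd - 2.0 * dr_n + rr_n) / rr_n


# Invented numbers for a single bin, with ten randoms per galaxy.
xi = landy_szalay(dd=2.0, dr=10.0, rr=100.0, n_gal=1, n_rand=10)
w = pair_weight(2.0, 3.0, 1.0, 0.5, 10.0, 20.0)
f = fibre_collision_factor(w_sp=0.5, w_ph=1.0)
```

The ratio in `fibre_collision_factor` tends to 1 at separations where the spectroscopic and photometric angular correlations agree, so dividing data-data weights by it only changes pairs at small angular separation.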

Rather than analysing these two-dimensional
correlation estimates, we integrate over the line-of-sight separation
$\pi$ to obtain an estimate of the projected stellar mass
correlation function, $w_p^*(r_p)$. This function is independent of
redshift-space distortions and is related by a simple integral
transform to the three-dimensional spatial autocorrelation
function for stellar mass $\xi^*(r)$. Thus we take

$w_p^*(r_p) = \int_{-\pi_{\max}}^{+\pi_{\max}} \xi^*(r_p,\pi)\,d\pi = \sum_i \xi^*(r_p,\pi_i)\,\Delta\pi_i$    (7)

where we choose $\pi_{\max} = 40\,h^{-1}$ Mpc as the outer limit for
the depth integration (in order to limit noise from distant
uncorrelated regions) so that the summation for computing
$w_p^*(r_p)$ runs from $\pi_1 = -39.5\,h^{-1}$ Mpc to $\pi_{80} = 39.5\,h^{-1}$
Mpc, given that we use bins of width $\Delta\pi_i = 1\,h^{-1}$ Mpc.
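
The discrete line-of-sight sum can be written out explicitly. The sketch below uses the bin layout stated in the text (80 bins of width $1\,h^{-1}$ Mpc, centred from $-39.5$ to $39.5\,h^{-1}$ Mpc); the function names are hypothetical and `xi_of_pi` stands for the measured $\xi^*(r_p, \pi)$ at fixed $r_p$.

```python
def pi_bin_centres(pi_max=40.0, d_pi=1.0):
    """Centres of the line-of-sight bins between -pi_max and +pi_max."""
    n_bins = int(round(2.0 * pi_max / d_pi))
    return [-pi_max + (k + 0.5) * d_pi for k in range(n_bins)]


def projected_correlation(xi_of_pi, pi_max=40.0, d_pi=1.0):
    """Sum xi(r_p, pi_i) * d_pi over all line-of-sight bins at fixed r_p."""
    return sum(xi_of_pi(p) for p in pi_bin_centres(pi_max, d_pi)) * d_pi


centres = pi_bin_centres()
# Sanity check: a constant xi = 1 integrates to the full depth, 80 h^-1 Mpc.
w_const = projected_correlation(lambda p: 1.0)
```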

The measurements of $w_p^*(r_p)$ obtained in this way are
plotted as triangles in the top panel of Figure 2. Error bars
are estimated from the scatter between estimates of $w_p^*(r_p)$
made by applying exactly the same procedures to our 20
mock SDSS samples. We have also estimated error bars
by bootstrap resampling of the SDSS data themselves. As
shown in Figure 3, these two methods produce consistent
results on small scales ($< 200\,h^{-1}$ kpc) but at larger separations
the bootstrap estimates are consistently smaller than those
obtained from the mock catalogues. This is because the
former do not account properly for cosmic variance, which is
the primary source of uncertainty on large scales. Over the
range $10\,h^{-1}$ kpc $< r_p < 10\,h^{-1}$ Mpc the measurements have
errors below 10% and are remarkably well approximated by
a power law. The rms scatter around the power law shown in
the figure is only 6.9% over this range. At larger separations
$w_p^*(r_p)$ rolls off below the extrapolation of the power law.
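
For a pure power law $\xi(r) = (r/r_0)^{-\gamma}$ integrated along an unbounded line of sight, the projected function is the standard result $w_p(r_p) = r_p\,(r_0/r_p)^{\gamma}\,\Gamma(1/2)\,\Gamma((\gamma-1)/2)/\Gamma(\gamma/2)$. The sketch below evaluates it for the quoted $r_0$ and $\gamma$. It is an idealisation: the measurement truncates the integral at $40\,h^{-1}$ Mpc, so this is not necessarily how the fit in the paper was performed.

```python
import math


def projected_power_law(r_p, r0=6.1, gamma=1.84):
    """Projected correlation function of xi(r) = (r/r0)**(-gamma),
    assuming an infinite line-of-sight integration depth.
    Lengths are in h^-1 Mpc."""
    amplitude = (math.gamma(0.5) * math.gamma((gamma - 1.0) / 2.0)
                 / math.gamma(gamma / 2.0))
    return r_p * (r0 / r_p) ** gamma * amplitude


# The projected function is itself a power law with slope 1 - gamma = -0.84.
slope = math.log10(projected_power_law(10.0) / projected_power_law(1.0))
```

A three-dimensional index of 1.84 therefore corresponds to a projected slope of $-0.84$ over the range where truncation at finite depth is unimportant.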



Figure 5. Distribution across galactic stellar mass of
contributions to our estimates of cosmic stellar mass density and of the
autocorrelation function for stellar mass. The dotted line shows
the contribution to the cosmic stellar mass density coming from
galaxies in each bin of $\log m_*$. The other two curves show
contributions to our estimate of the stellar mass autocorrelation
function binned as a function of the stellar mass of the most massive
(solid line) and of the least massive (dashed line) galaxy in each
pair. The three curves are normalised to have the same integral.



A similar result with a slightly but significantly shallower
power law index was obtained by Hawkins et al. (2003) for
galaxies (rather than stellar mass) in the 2dF Galaxy
Redshift Survey, and only quite subtle deviations from power
laws are seen in earlier SDSS measurements of galaxy
correlations (e.g. Zehavi et al. 2004, 2005; Masjedi et al. 2006).
This in turn echoes the very first galaxy correlation
results obtained by Totsuji & Kihara (1969), Peebles (1974)
and Davis & Peebles (1983).

For comparison, Figure 2 also shows predictions for the
corresponding projected 2PCF for dark matter,
obtained from the z = 0.065 snapshot of the Millennium
Simulation (as we show below, this is the appropriate mean
depth for our SDSS measurement). A maximum line-of-sight
depth of $40\,h^{-1}$ Mpc was adopted when computing this
statistic in order to mimic our SDSS procedures. The result,
plotted as a solid line in the top panel of the figure, shows
much more pronounced features than the stellar mass
correlation function. The ratio between them is shown in the lower
two panels, and can be thought of as an estimate of the
scale-dependent bias between stars and dark matter. Our
results are consistent with bias being independent of scale
at $r_p > 1.5\,h^{-1}$ Mpc, but the scale-dependence at smaller
separations is strong, with a total range of a factor of 5.

In order to interpret the stellar mass autocorrelation
function of Figure 2 it is helpful to see how the contributions
to our estimate are distributed in redshift and across
galaxies of differing individual stellar mass. Figure 4 illustrates
the distribution in redshift. The solid curve histograms
contributions to Equation (3) from pairs with $r_p < 1.0\,h^{-1}$ Mpc
and $|\pi| < 40\,h^{-1}$ Mpc. The median of this distribution is



© 2008 RAS, MNRAS 000.ITHT21 






z = 0.067 and the 10 and 90% points are z = 0.025 and 0.12,
respectively. (These results are very insensitive to the
particular $r_p$ range chosen.) For comparison, the dotted line is a
redshift histogram for all galaxies in our spectroscopic
sample and has median at z = 0.088, and 10 and 90% points at
z = 0.033 and 0.16, while the dashed line is the histogram of
contributions to our estimate of the cosmic stellar mass
density (i.e. $m_{*,i}/(f_{spec,i}\,V_{\max,i})$ for galaxy $i$ at redshift $z_i$) with
median at z = 0.080, and 10 and 90% points at z = 0.026
and 0.16. The autocorrelation function is dominated by
contributions from a narrower redshift range and peaking at
lower redshift than is the parent sample or the stellar
density estimate we derived from it. It is also remarkable that
although our sample includes almost half a million galaxies
and covers a sixth of the sky, the histograms of Figure 4
still show strong features which reflect large-scale structure.
Our mock catalogues show that features at this level,
although striking, are expected and do not appreciably affect
our estimate of $w_p^*(r_p)$.

Figure 5 shows contributions to our estimate of $w_p(r_p)$
histogrammed as a function of the stellar mass of the most
massive (solid line) and of the least massive (dashed line)
galaxy in each pair. As was the case for the redshift distributions,
we find these curves to be almost independent of
the $r_p$ range used; here we again use pairs with $r_p < 1.0\,h^{-1}$ Mpc
and $|\pi| < 40\,h^{-1}$ Mpc. For comparison, we also show a histogram
of contributions to our estimate of the cosmic stellar
mass density as a function of the stellar mass of the individual
galaxies. This is effectively the stellar mass function
of Figure 1 multiplied by $M_*$ and plotted on a linear scale.
In all three cases, the dominant contributions come from a 
relatively narrow range of stellar mass centred quite close to 
the mass of the Milky Way. The autocorrelations are dom- 
inated by contributions from galaxies in an even narrower 
mass range than those dominating the cosmic stellar mass 
density, by pairs quite similar to the Local Group. 
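As a rough check on where the mass-density contributions peak, one can weight a Schechter function by mass. The sketch below assumes that the Schechter form with the parameters quoted in the abstract is an adequate description (the abstract notes that the fit is not perfect); the function name and the grid are illustrative choices.

```python
import math

M_CHAR = 6.7e10   # characteristic stellar mass quoted in the abstract (solar masses)
ALPHA = -1.155    # faint-end slope quoted in the abstract


def mass_density_per_dex(mass, m_char=M_CHAR, alpha=ALPHA):
    """Unnormalised stellar mass density per dex for a Schechter function."""
    x = mass / m_char
    return x ** (alpha + 2.0) * math.exp(-x)


# Scan a logarithmic grid from 1e8 to 1e12 solar masses for the peak.
grid = [10.0 ** (8.0 + 0.001 * i) for i in range(4001)]
peak_mass = max(grid, key=mass_density_per_dex)

# The maximum of x**(alpha + 2) * exp(-x) lies at x = alpha + 2.
analytic_peak = (ALPHA + 2.0) * M_CHAR
```

Under these assumptions the peak lies slightly below the characteristic mass, which is consistent with the dominant contributions coming from galaxies of roughly Milky Way mass.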



5 DISCUSSION 

Our estimates of the mean stellar mass density and of the 
stellar mass autocorrelation function allow us to calculate 
the average stellar mass within distance r of a randomly 
chosen star. This is 

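$$M_*(r) = M_* + 4\pi\rho_* \int_0^r \left[1 + \xi^*(r')\right] r'^2\, dr' \qquad (8)$$

$$\phantom{M_*(r)} = M_*\left[1 + \left(r/[\ldots]\right)^{1.16} + \left(r/2.8\,h^{-1}\,{\rm Mpc}\right)^{3}\right],$$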

where the constant $M_* = 2.85 \times 10^{10}\,h^{-2}\,M_\odot$, the mean stellar
mass of the chosen star's own galaxy, was calculated
from the stellar mass function in section 3. The second and
third terms account for stars in other galaxies. It is interesting
that these terms become comparable to $M_*$ only at
$350\,h^{-1}$ kpc. This is at least thirty times the size of the stellar
component of a typical galaxy, a factor which quantifies how 
much dissipative effects have condensed the visible compo- 
nents of galaxies with respect to the larger scale dissipation- 
less hierarchy. 
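The scales in the second line of equation (8) follow from inserting a pure power law into the integral. The sketch below assumes that $\xi^*(r) = (r/6.1\,h^{-1}\,{\rm Mpc})^{-1.84}$ (the form quoted in the abstract) holds all the way down to r = 0, and it infers the mean density from the scale of the cubic term; the variable names are illustrative.

```python
import math

R0, GAMMA = 6.1, 1.84   # xi(r) = (r/R0)**-GAMMA, r in Mpc/h (from the abstract)
R3 = 2.8                # scale of the cubic term in equation (8), Mpc/h
M_GAL = 2.85e10         # mean stellar mass of the star's own galaxy, Msun/h^2

# The cubic term fixes the mean density: 4*pi*rho/(3*M_GAL) = R3**-3.
rho_star = 3.0 * M_GAL / (4.0 * math.pi * R3 ** 3)

# Integrating xi(r') r'^2 gives R0**GAMMA * r**(3-GAMMA) / (3-GAMMA), so the
# clustering term is (r/r1)**(3-GAMMA) with the scale r1 below.
r1 = ((3.0 - GAMMA) * R3 ** 3 / (3.0 * R0 ** GAMMA)) ** (1.0 / (3.0 - GAMMA))


def neighbour_terms(r):
    """Stellar mass in other galaxies within r, in units of M_GAL."""
    return (r / r1) ** (3.0 - GAMMA) + (r / R3) ** 3


# Bisect for the radius at which the neighbour terms equal M_GAL.
lo, hi = 0.01, 2.0
for _ in range(60):
    mid = 0.5 * (lo + hi)
    if neighbour_terms(mid) < 1.0:
        lo = mid
    else:
        hi = mid
r_equal = 0.5 * (lo + hi)
```

Under these assumptions both r1 and r_equal come out at roughly $0.36\,h^{-1}$ Mpc, close to the $350\,h^{-1}$ kpc quoted in the text, and the implied mean stellar density is about $3 \times 10^{8}\,h\,M_\odot\,{\rm Mpc}^{-3}$.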

It is striking that our measured stellar-mass autocorre- 
lation function is very well fit by a power law over about 
three orders of magnitude in spatial scale. This behaviour
breaks down dramatically on smaller scales where "one-galaxy"
contributions cause $\xi^*$ to jump in amplitude by
about two orders of magnitude (see equation (8)) but also on
larger scales where we detect the roll-down in amplitude expected
in ΛCDM universes. When it was first seen in galaxy
autocorrelations, such power-law behaviour was interpreted
as evidence for the scale-free nature of hierarchical clustering
under gravity (e.g. Peebles 1980). High-quality numerical
simulations have improved our understanding of this pro- 
cess considerably, showing that precise power-law behaviour 
should not be expected, either on highly nonlinear scales or 
in the transition between linear and nonlinear scales. Fig. [2] 
shows dark matter correlations in the Millennium Simula- 
tion to depart substantially from power-law behaviour in 
both these regimes. It is thus surprising that the observed 
stellar mass autocorrelation is an excellent power law between
$10\,h^{-1}$ kpc and $10\,h^{-1}$ Mpc. In our standard structure
formation model, this must be seen as a coincidence. Different
processes are required to cause convergence towards the
observed power law on different scales.
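For a pure power law the projected and three-dimensional functions are related by a standard closed-form expression involving gamma functions. The sketch below evaluates that expression for the power law quoted in the abstract and compares it with a direct line-of-sight integral truncated at $40\,h^{-1}$ Mpc, the limit we use when counting pairs. It is an illustration under the stated power-law assumption, not a description of how the conversion was carried out for the paper; the function names are hypothetical.

```python
import math


def wp_power_law(rp, r0=6.1, gamma=1.84):
    """Projection of xi(r) = (r/r0)**-gamma over the full line of sight."""
    factor = (math.gamma(0.5) * math.gamma(0.5 * (gamma - 1.0))
              / math.gamma(0.5 * gamma))
    return rp * (r0 / rp) ** gamma * factor


def wp_truncated(rp, pi_max=40.0, r0=6.1, gamma=1.84, n_steps=40000):
    """Midpoint-rule integral of 2*xi over line-of-sight separations below pi_max."""
    step = pi_max / n_steps
    total = 0.0
    for i in range(n_steps):
        los = (i + 0.5) * step               # line-of-sight separation
        total += (math.hypot(rp, los) / r0) ** -gamma
    return 2.0 * total * step


full = wp_power_law(1.0)      # rp = 1 Mpc/h, integrated to infinity
cut = wp_truncated(1.0)       # same, but limited to 40 Mpc/h
```

At $r_p = 1\,h^{-1}$ Mpc the truncated integral is a few per cent below the full projection, which indicates the size of the correction associated with a finite line-of-sight limit for this power law.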

We use the Millennium Simulation to explore this issue
further in Figure 6. This compares projected stellar and dark
matter autocorrelation functions at z = 0.07, 1.0 and 3.0 using
the semi-analytic model of De Lucia & Blaizot (2007) to
specify the stellar masses of the "galaxies". The upper panel
shows the dark and stellar mass autocorrelations separately,
while the lower panel shows their ratio, the "bias factor",
as a function of scale. The well-known result that galaxy
correlations are predicted to evolve much more weakly than
dark matter correlations is very clear in this plot. More interesting
in the current context is the fact that the predicted
z = 0 stellar mass correlations are much closer to a power
law for $r_p < 10\,h^{-1}$ Mpc than are those for the dark matter,
although the deviations are significantly larger than in
the real data of Fig. 2. The two can be compared in the
lower panel of Fig. 6, where the SDSS bias data are shown as
symbols with error bars. Although the model reproduces the
"two-halo" part of the observed function very well, it overpredicts
the "one-halo" part by about 50% and this leads to
the bulge above a power law which is visible in the upper
panel. At higher redshifts the model stellar mass autocorrelations
maintain their power-law behaviour on small scales,
becoming even steeper than at z = 0. By z = 3 the transition
between one and two halo terms has become very obvious.
This reinforces the conclusion that the remarkably precise
power law of Fig. 2 is just a coincidence.

Another interesting feature of the model bias curves in
Fig. 6 is the fact that they are smooth and almost constant
over the range $200\,h^{-1}$ kpc $< r_p < 15\,h^{-1}$ Mpc. They show
at most a very weak feature at the transition between the
1-halo and 2-halo regimes, despite the fact that the projected
correlation functions from which they were derived
show such features quite clearly. In contrast, when we divide
our featureless observed $w_p(r_p)$ by the corresponding
z = 0.07 function for dark matter in the Millennium Simulation,
the resulting bias curve shows an obvious step at the
1-halo/2-halo transition which reflects the marked change in
slope of the dark matter correlations at this point (see the
bottom panel of Fig. 2). This suggests that the amplitude of
mass fluctuations is too high in the Millennium Simulation,
and that the shape of our stellar mass correlation function
can be used to estimate the value of the fluctuation amplitude
parameter $\sigma_8$.

Figure 6. Projected autocorrelation functions at z = 0.07, 1.0
and 3.0 for the stellar and for the dark matter mass in the Millennium
Simulation. The semi-analytic model of De Lucia & Blaizot
(2007) is used to specify the positions, velocities and stellar
masses of the galaxies. The upper panel shows results for the
stellar mass (symbols) and for the dark matter (lines) separately.
Lines in the lower panel show the "bias", i.e. the ratio at each
redshift of the stellar mass and dark matter functions in the upper
panel. The symbols in the lower panel repeat the SDSS bias
data from Figure 2 and should be compared with the model curve
shown as a solid line.

We illustrate this possibility in Fig. 7 where we use the
projected dark matter correlations measured in the Millennium
Simulation at redshifts of 0.32, 0.62 and 0.99 to represent
those expected at z = 0.07 in cosmologies with identical
parameters except that $\sigma_8$ is reduced to 0.8, 0.7 and 0.6
respectively$^3$. We derive bias functions from our SDSS stellar
mass correlations using these three functions in addition to
the original z = 0.07 data, and we compare their shapes to
those of model bias functions at z = 0.07, 1 and 3 taken from
Fig. 6. For $\sigma_8 = 0.9$ the 1-halo/2-halo transition feature is
clear and is much larger than any feature seen in the models.
For $\sigma_8 = 0.7$ the feature has reversed sign and the inferred
bias function has a much more marked overall slope than
any of the models. The "Goldilocks" solution appears to be
close to $\sigma_8 = 0.8$ where the bias function is quite flat and the
transition feature is almost absent. This accords quite well
with the values of $\sigma_8$ suggested by joint analysis of WMAP5
and low-redshift supernova and BAO data (Komatsu et al.
2009).

3 This scaling assumes that the nonlinear dark matter power
spectrum depends on the shape of the linear power spectrum and
its extrapolated amplitude, but not on other parameters such as
$\Omega_m$ or $\Omega_\Lambda$. This approximation is quite good over the range of
scales relevant here, and is certainly good enough for the present,
essentially qualitative argument.

Figure 7. Bias as a function of scale for the SDSS stellar mass
distribution taking z = 0.07 dark matter correlations corresponding
(approximately) to the Millennium Simulation cosmology with
$\sigma_8$ = 0.9, 0.8, 0.7 and 0.6 (triangles with error bars, labelled by
$\sigma_8$). These are compared to model bias functions at z = 0.07, 1
and 3 for the stellar mass distribution in the De Lucia & Blaizot
(2007) galaxy formation model (solid lines labelled by redshift).
Only for $\sigma_8 = 0.8$ is the bias function inferred from the SDSS data
as flat and as free from features at the 1-halo/2-halo transition
as the model functions.
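The correspondence between simulation output redshift and effective $\sigma_8$ can be checked with linear growth factors. The sketch below assumes a flat ΛCDM model with $\Omega_m = 0.25$ and $\Omega_\Lambda = 0.75$ (the Millennium Simulation values, which are not restated in this section), neglects radiation, and uses a simple midpoint integration; the function names are illustrative.

```python
import math

OMEGA_M, OMEGA_L = 0.25, 0.75   # assumed Millennium Simulation density parameters
SIGMA8_SIM = 0.9                # fluctuation amplitude of the simulation
Z_REF = 0.07                    # redshift at which the SDSS data are compared


def growth_factor(z, omega_m=OMEGA_M, omega_l=OMEGA_L, n_steps=20000):
    """Linear growth factor D(z), up to a constant factor, for flat LCDM."""
    a_end = 1.0 / (1.0 + z)

    def a_times_e(a):
        # a * H(a) / H0 for matter plus a cosmological constant
        return math.sqrt(omega_m / a + omega_l * a * a)

    da = a_end / n_steps
    integral = da * sum(a_times_e((i + 0.5) * da) ** -3 for i in range(n_steps))
    # D(a) is proportional to H(a) times the integral of da / (a H)^3.
    return a_times_e(a_end) / a_end * integral


def effective_sigma8(z_output):
    """sigma_8 of a z = Z_REF universe mimicked by the output at z_output."""
    return SIGMA8_SIM * growth_factor(z_output) / growth_factor(Z_REF)


for z_out in (0.32, 0.62, 0.99):
    print(f"z = {z_out:.2f}: effective sigma_8 ~ {effective_sigma8(z_out):.2f}")
```

Only ratios of growth factors enter, so the normalisation of D(z) is irrelevant here. The resulting amplitudes should lie close to 0.8, 0.7 and 0.6, to within the accuracy of the approximation described in footnote 3.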

If it can be shown that the smooth behaviour of the
model bias functions of Fig. 6 is generic to all physically reasonable
ΛCDM galaxy formation models, then these results
suggest a powerful cosmological test. Projected stellar mass
correlation functions can be estimated as a function of redshift
from any large galaxy survey with sufficiently good photometry
to obtain robust and relatively precise photometric
redshifts. The 1-halo/2-halo feature typically occurs at radii
where these correlation function estimates have their best
signal-to-noise, and thus is much easier to measure than,
say, the baryon oscillation feature. Requiring the bias functions
to be smooth and nearly flat will determine $\sigma_8(z)$. In
addition, the location of the transition feature in the individual
correlation functions provides a length-scale which
can be used to get an angular size distance to each redshift.
If it works, this scheme thus provides measurements of both
the functions of redshift which are needed to constrain the
nature of Dark Energy.

Although the De Lucia & Blaizot (2007) galaxy catalogue
provides a surprisingly good fit to the observed clustering
of stellar mass, it does much less well when compared
to the stellar mass function of Fig. 1. The very small
error bars highlight discrepancies which could be ignored
when comparing to previously published and less precise
mass functions. The most important of these concerns the
abundances at low mass, where the SDSS measurements are
a factor of two below the galaxy formation models implemented
on the Millennium Simulation over the full range
$8 < \log M_*/M_\odot < 9.5$ (see Fig. B1 below). Clearly, galaxy
formation in low-mass halos was considerably less efficient in
the real Universe than these models predict. Precise statistics
for the distribution of stellar mass of the kind obtained
in this paper provide hard constraints for galaxy formation
models, and learning what is required to fit them properly
may teach us much about the physics of galaxy formation.



APPENDIX A: SYSTEMATIC BIASES DUE TO
THE STELLAR MASS DEFINITION




[Figure A1. Upper panel abscissa: $\log(M_{\rm Blanton}/h^{-2}M_\odot)$. Lower
panel abscissa: $\log(M_*/h^{-2}M_\odot)$, with legend entries
$M_* = M_{\rm Blanton}$, $M_* = M_{\rm Kauffmann}$ and
$\log(M_*) = \log(M_{\rm Kauffmann}) - 0.1$.]

Figure A1. Upper: Stellar mass estimates from
Blanton & Roweis (2007), $M_{\rm Blanton}$, compared to those of
Kauffmann et al. (2003), $M_{\rm Kauffmann}$, as a function of $M_{\rm Blanton}$.
The gray scale is the distribution of $M_{\rm Kauffmann}/M_{\rm Blanton}$ at
each value of $M_{\rm Blanton}$. The solid lines (from bottom to top) are
the 10%, 25%, 50%, 75%, and 90% quantiles of this distribution.
The dotted line is a fit to the median (see the text). Lower: Stellar
mass functions for DR4 estimated using $M_{\rm Blanton}$ (symbols) and
$M_{\rm Kauffmann}$ (solid line). The dashed line is obtained from the
solid line by shifting it by $\Delta\log M = -0.1$.



In order to explore possible systematics in our results due
to the particular Blanton & Roweis (2007) stellar mass estimates
we have used, we here repeat parts of our analysis using
both the spectroscopy-photometry-based stellar masses
of Kauffmann et al. (2003), $M_{\rm Kauffmann}$, and the simpler
colour-magnitude-based stellar masses of Bell et al. (2003),
$M_{\rm Bell}$.



For $M_{\rm Kauffmann}$ we have had to go back to the Sample
dr4 of NYU-VAGC, which is based on SDSS Data Release
4 (DR4; Adelman-McCarthy et al. 2006), since $M_{\rm Kauffmann}$
is not available for later releases. From Sample dr4 we have
selected a sample of 300,596 galaxies using the same criteria
as in § 2.1. Measurements of $M_{\rm Kauffmann}$ are taken from
the MPA/JHU SDSS DR4 database$^4$. The reader is referred
to Kauffmann et al. (2003) for a detailed description of the
methodology used to derive $M_{\rm Kauffmann}$. In brief, the amplitude
of the 4000-Å break, $D_{4000}$, and the strength of the Hδ
absorption line were obtained from stellar population synthesis
models for a library of 32,000 diverse star formation
histories. A maximum-likelihood estimate of the z-band stellar
mass-to-light ratio can then be obtained for each galaxy
from its observed $D_{4000}$ and Hδ indices and applied to its
observed z-band luminosity to estimate its stellar mass.

4 http://www.mpa-garching.mpg.de/SDSS/DR4/

Figure A2. Upper: Stellar mass estimates from
Blanton & Roweis (2007), $M_{\rm Blanton}$, compared to $(r - i)$-$r$
based estimates from the formulae of Bell et al. (2003), $M_{\rm Bell}$,
as a function of $M_{\rm Blanton}$. The gray scale is the distribution
of $M_{\rm Bell}/M_{\rm Blanton}$ at each value of $M_{\rm Blanton}$. The solid lines
(from bottom to top) are the 10%, 25%, 50%, 75%, and 90%
quantiles. The dotted line is a fit to the median (see the text).
Lower: Stellar mass functions for DR7 estimated using $M_{\rm Blanton}$
(symbols) and $M_{\rm Bell}$ (solid line). The dashed line is obtained
from the solid line by shifting it by $\Delta\log M = -0.05$.

[Figure A3. Abscissa: $r_p$ [$h^{-1}$ Mpc], logarithmic from 0.01 to 100.]

Figure A3. Ratio of the projected stellar mass autocorrelation
function in the SDSS estimated using $M_{\rm Bell}$ to that estimated
using $M_{\rm Blanton}$.

The upper panel of Figure A1 shows a galaxy-by-galaxy
comparison of $M_{\rm Kauffmann}$ to $M_{\rm Blanton}$, the stellar mass estimate
used in the main part of this paper, as a function of
$M_{\rm Blanton}$. As already shown in Blanton & Roweis (2007, see
their Fig. 17), the two estimates are very similar, with a
typical scatter of 0.1 dex and with offsets below 0.1 dex at
all $M_{\rm Blanton}$. The median of $\log(M_{\rm Kauffmann}/M_{\rm Blanton})$, $\Delta m$,
as a function of $m = \log(M_{\rm Blanton}/h^{-2}M_\odot)$ can be well represented
by a hyperbolic tangent function,

$$\Delta m = 0.0256 + 0.0478 \tanh[(m - 9.73)/0.417], \qquad ({\rm A1})$$

which we show as a dotted line in the figure.
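Equation (A1) is simple to evaluate directly. The following sketch does so and records the two asymptotic offsets; the function name is an illustrative choice.

```python
import math


def delta_m_kauffmann(m):
    """Median log10(M_Kauffmann/M_Blanton) from the tanh fit of equation (A1).

    m is log10 of M_Blanton in units of h^-2 solar masses.
    """
    return 0.0256 + 0.0478 * math.tanh((m - 9.73) / 0.417)


low_mass_offset = delta_m_kauffmann(8.0)     # approaches 0.0256 - 0.0478
high_mass_offset = delta_m_kauffmann(11.5)   # approaches 0.0256 + 0.0478
```

Both limits are smaller than 0.1 dex in absolute value, in line with the statement that the offsets stay below 0.1 dex at all masses.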

In the lower panel of Figure A1 the stellar mass function
estimated from $M_{\rm Kauffmann}$ is plotted as a solid line, and
is compared to that from $M_{\rm Blanton}$ shown as triangles. Both
estimates are based on DR4 so the $M_{\rm Blanton}$ function differs
from that of Figure 1. Error bars on the latter show the
1σ scatter between 20 mock catalogues constructed from the
Millennium Simulation using the same sky mask, magnitude
and redshift limits as for the real DR4 sample, as described
in § 2.2. The two mass functions are consistent with each
other at masses below $\sim 5 \times 10^{10}\,M_\odot$. At higher masses, the
$M_{\rm Kauffmann}$-based mass function lies above that based on
$M_{\rm Blanton}$. This difference almost disappears if the former
is shifted to the left by $\Delta\log M = -0.1$ as shown by the
dashed line. With this small shift in mass scale, the difference
in abundance between the two determinations nowhere
exceeds 0.1 dex.

We now consider $M_{\rm Bell}$. Here we use the DR7 data as in
the main text and we compute $M_{\rm Bell}$ for each galaxy from
its $r - i$ colour and its r-band luminosity using the formulae
given by Bell et al. (2003, see their Appendix 2 and Table 7)
for a Kroupa IMF (Kroupa 2001). This is the IMF adopted
by Kauffmann et al. (2003) and it is quite similar to the
Chabrier IMF assumed by Blanton & Roweis (2007).

Figure A2 is identical in format to Figure A1. The upper
panel is a direct galaxy-by-galaxy comparison of $M_{\rm Bell}$
and $M_{\rm Blanton}$, while the lower panel compares DR7 mass
functions obtained with the two stellar mass estimates. The
symbols and lines have the same meaning as in the previous
figure. The median mass ratio $M_{\rm Bell}/M_{\rm Blanton}$ is almost
independent of $M_{\rm Blanton}$ at masses above $\sim 10^{10}\,h^{-2}\,M_\odot$, but
it increases substantially at lower masses, reaching 0.3 dex
at $10^{8}\,h^{-2}\,M_\odot$. This behaviour can be modelled by a quartic
function,

$$\Delta m = 2.0 - 0.043\,m - 0.045\,m^2 + 0.0032\,m^3 - 2.1\times10^{-5}\,m^4, \qquad ({\rm A2})$$

where $m = \log(M_{\rm Blanton}/h^{-2}M_\odot)$ and $\Delta m =
\log(M_{\rm Bell}/M_{\rm Blanton})$. With a scale shift of $\Delta\log M = -0.05$,
the stellar mass function based on $M_{\rm Bell}$ is a good match to
that based on $M_{\rm Blanton}$ at masses above $10^{10}\,h^{-2}\,M_\odot$, but is
high by ~0.1 dex at lower masses.
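The quartic fit of equation (A2) can be evaluated in the same way as equation (A1). The sketch below assumes that the final term of the fit is proportional to the fourth power of m, as expected for a quartic in m; the function name is illustrative.

```python
def delta_m_bell(m):
    """Median log10(M_Bell/M_Blanton) from the quartic fit of equation (A2).

    m is log10 of M_Blanton in units of h^-2 solar masses.
    """
    coeffs = (2.0, -0.043, -0.045, 0.0032, -2.1e-5)   # powers m**0 .. m**4
    return sum(c * m ** k for k, c in enumerate(coeffs))


offset_low = delta_m_bell(8.0)     # low-mass end of the fitted range
offset_high = delta_m_bell(10.0)   # near the start of the flat regime
```

With this reading of the fit, the offset is about 0.3 dex at $m = 8$ and falls to roughly 0.05 dex for $m \geq 10$, matching both the description of the median ratio and the scale shift of 0.05 dex applied to the mass function.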

In summary, differences between these three observa- 
tional estimators of stellar mass affect our mass function 
determination primarily through small off-sets in the mass 
scale. Once this is taken into account, abundance offsets between
the different estimates are quite small across the full
range $10^{8}\,h^{-2}\,M_\odot < M_* < 10^{11.5}\,h^{-2}\,M_\odot$ that we consider. As
we now demonstrate, effects on our stellar mass autocorrelation
function estimate are much smaller.

In Figure A3 we plot the ratio of the stellar mass autocorrelation
function computed using $M_{\rm Bell}$ to that computed
using $M_{\rm Blanton}$. The two $w_p(r_p)$ measurements are indistinguishable
on scales above about 30 kpc. For $r_p < 30$ kpc, the
amplitude of the $M_{\rm Bell}$-based correlation function is slightly
higher, but still within the error bars of the measurements.
This can be understood from the fact that, as shown in
Figure 5, the mass autocorrelations are dominated by contributions
from galaxies in a relatively narrow mass range
where Figures A1 and A2 show the scatter between the various
estimators to be small. Any scale off-set between them
drops out in the definition of the autocorrelation function.

Figure B1. The 1σ confidence region for our DR7 stellar mass
function (median redshift 0.09) is plotted as a shaded region and is
compared to the stellar mass function for DR4 galaxies at z < 0.05
presented by Baldry et al. (2008) (filled circles). The stellar mass
function for z = 0 galaxies in the Millennium Simulation galaxy
formation model of De Lucia & Blaizot (2007) is also shown as
a solid line. Error bars on some of the circles show the 1σ scatter
between 20 mock catalogues constructed from the Millennium
Simulation using the same sky mask, magnitude and redshift limits
as in the sample of Baldry et al. (2008). Clearly the two SDSS
mass functions agree well.



APPENDIX B: INCOMPLETENESS FOR LOW 
SURFACE BRIGHTNESS GALAXIES 



As discussed by Blanton et al. (2005) the SDSS galaxy
samples are incomplete for low surface brightness (SB) objects.
This affects a negligible fraction of high-mass galaxies
but can become serious for dwarfs. Thus one may be concerned
that our stellar mass functions underestimate the
abundance of low-mass galaxies. Baldry et al. (2008) have
examined this issue in some detail, deriving the bivariate
distribution of SDSS galaxies in SB and stellar mass. They
found that for the stellar masses studied here, the distribution
of log(SB) is approximately gaussian at given stellar
mass, with a mean which decreases linearly with $\log(M_*)$
over the range $10^{8.5}$ to $10^{11}\,M_\odot$ and a scatter which increases
slightly towards lower masses (see their Fig. 4). As stellar
mass decreases, the fraction of galaxies with SB below the
SDSS completeness limit thus steadily increases. Nevertheless,
at the lower limit to which we plot our mass function
($\log(M_*) = 8.3$ in Fig. 4 of Baldry et al. (2008), since they
adopt h = 0.7) the estimated completeness is still well above
70%. We therefore expect incompleteness to have at most a
minor effect on our results.
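For a gaussian distribution of surface brightness at fixed stellar mass, the completeness is simply the fraction of the gaussian lying on the bright side of the survey limit. A minimal sketch follows; the function name and all numerical values are hypothetical and are not taken from Baldry et al. (2008).

```python
import math


def sb_completeness(mean_sb, sigma_sb, sb_limit):
    """Fraction of a gaussian SB distribution brighter than the limit.

    Surface brightness is in mag arcsec^-2, so "brighter" means
    numerically smaller than sb_limit.
    """
    t = (sb_limit - mean_sb) / (sigma_sb * math.sqrt(2.0))
    return 0.5 * (1.0 + math.erf(t))


# Hypothetical numbers for illustration only.
complete = sb_completeness(mean_sb=22.5, sigma_sb=0.8, sb_limit=23.0)
```

As the mean surface brightness at fixed mass approaches the limit, or the scatter grows, the completeness returned by this function falls towards 50%, which is the trend described above for decreasing stellar mass.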

In Figure B1 we compare our DR7 stellar mass function
explicitly to the one which Baldry et al. (2008) estimated
for DR4 galaxies at z < 0.05. Baldry et al. (2008) used stellar
masses estimated as the average of those obtained by
four different published methods (Kauffmann et al. 2003;
Glazebrook et al. 2004; Gallazzi et al. 2005; Panter et al.
2007). In the mean this should give a mass scale close to that
of Blanton & Roweis (2007). Because of the substantially
smaller volume of their sample, the Baldry et al. (2008) results
are substantially noisier than our own. (In order to
include cosmic variance effects, we have derived the error
bars in the figure from mock catalogues in a similar way
to those shown on our own mass functions in the main
body of our paper.) Agreement is very good over the full
mass range plotted. Furthermore, the analysis in their paper
shows that SB incompleteness affects only the two or
three lowest mass points plotted in Figure B1 and so is negligible
for our purposes. This figure also shows the stellar
mass function of the Millennium Simulation galaxy formation
model of De Lucia & Blaizot (2007). This agrees well
with the observations around the knee of the function, but it
predicts the rarest and most massive galaxies to have stellar
masses 0.2 dex larger than those estimated in SDSS, and, as
noted in the final paragraph of our main text, it predicts an
abundance of low-mass galaxies which is substantially larger
than observed.

Finally we note that SB incompleteness has no effect on 
the stellar mass autocorrelations that we estimate because, 
as we show in Figure 5, these are dominated by contributions
from galaxies of substantially larger stellar mass. 

This paper has been typeset from a TeX/LaTeX file prepared
by the author.